Why is building a heap \(O(n)\) and not \(O(n \log n)\)? The work is not distributed evenly.

  • Most nodes in a complete binary tree are leaves at the bottom. The `bubbleDown` operation on a leaf does zero work.
  • About half the nodes (\(\approx n/2\)) are leaves.
  • About a quarter of the nodes (\(\approx n/4\)) are one level up. They do at most 1 swap.
  • The amount of work done per node increases as we get closer to the root, but the number of nodes at those levels decreases exponentially.
  • The total work is a sum that mathematically converges to be proportional to \(n\), not \(n \log n\).

Formal Analysis

The total work is the sum of the work done at each height \(h\): \[\sum_{h=0}^{\log n} \frac{n}{2^{h+1}} \cdot O(h) \approx n \sum_{h=0}^{\log n} \frac{h}{2^{h+1}} \approx O(n)\]

Initial state